Sentiment Classification in Resource-Scarce Languages by using Label Propagation
Abstract
With the advent of consumer-generated media (e.g., Amazon reviews and Twitter), sentiment classification has become a hot topic. Previous work relies heavily on large amounts of linguistic resources, which are difficult to obtain for resource-scarce languages. To overcome this problem, we investigate the usefulness of label propagation, a graph-based semi-supervised learning method. Extensive experimental evaluation on three real datasets demonstrates that label propagation performs more stably than support vector machines (SVMs) and transductive support vector machines (TSVMs) in a document-level sentiment classification task for resource-scarce languages (Chinese in our case).
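As a rough illustration of the general idea behind label propagation (not the authors' implementation, whose graph construction and parameters are not given here), the sketch below spreads sentiment labels from a few labeled documents to unlabeled ones over a similarity graph. The function name `propagate_labels`, the toy weight matrix, and the seed choice are all hypothetical; in practice the edge weights would come from some document similarity measure.

```python
def propagate_labels(weights, seeds, n_classes=2, n_iter=100):
    """Iterative label propagation on a small similarity graph.

    weights: symmetric n x n list of lists with non-negative similarities
             (hypothetical values; the source does not specify a measure).
    seeds:   dict mapping node index -> class id for the labeled documents.
    Returns the predicted class id for every node.
    """
    n = len(weights)
    # Unlabeled nodes start with a uniform distribution over classes.
    scores = [[1.0 / n_classes] * n_classes for _ in range(n)]
    # Labeled nodes start (and stay) at a one-hot distribution.
    for i, c in seeds.items():
        scores[i] = [1.0 if k == c else 0.0 for k in range(n_classes)]

    for _ in range(n_iter):
        new_scores = []
        for i in range(n):
            if i in seeds:
                new_scores.append(scores[i])  # clamp the known labels
                continue
            total = sum(weights[i][j] for j in range(n) if j != i)
            if total == 0:
                new_scores.append(scores[i])  # isolated node: nothing to absorb
                continue
            # Each unlabeled node takes the weighted average of its neighbors.
            row = [
                sum(weights[i][j] * scores[j][k] for j in range(n) if j != i) / total
                for k in range(n_classes)
            ]
            new_scores.append(row)
        scores = new_scores

    return [max(range(n_classes), key=lambda k: s[k]) for s in scores]


# Toy graph: documents 0-2 are mutually similar, documents 3-5 are mutually
# similar, and the two groups are joined only by a weak edge between 2 and 3.
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
# Only two documents are labeled: 0 is positive (1), 5 is negative (0).
labels = propagate_labels(W, seeds={0: 1, 5: 0})
print(labels)  # [1, 1, 1, 0, 0, 0]
```

With only two labeled documents, the remaining four inherit the label of the densely connected group they belong to, which is the property that makes this family of methods attractive when labeled data is scarce.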
Similar resources
Sentiment Classification in Under-Resourced Languages Using Graph-Based Semi-Supervised Learning Methods
In sentiment classification, conventional supervised approaches heavily rely on a large amount of linguistic resources, which are costly to obtain for under-resourced languages. To overcome this scarce resource problem, there exist several methods that exploit graph-based semi-supervised learning (SSL). However, fundamental issues such as controlling label propagation, choosing the initial seeds...
Cross Lingual Sentiment Analysis using Modified BRAE
Cross-lingual learning provides a mechanism to adapt NLP tools available for label-rich languages to achieve similar tasks for label-scarce languages. An efficient cross-lingual tool significantly reduces the cost and effort required to manually annotate data. In this paper, we use the Recursive Autoencoder architecture to develop a Cross Lingual Sentiment Analysis (CLSA) tool using sentence al...
Learning Bilingual Sentiment Word Embeddings for Cross-language Sentiment Classification
The sentiment classification performance relies on high-quality sentiment resources. However, these resources are imbalanced in different languages. Cross-language sentiment classification (CLSC) can leverage the rich resources in one language (source language) for sentiment classification in a resource-scarce language (target language). Bilingual embeddings could eliminate the semantic gap bet...
Cross-lingual Sentiment Lexicon Learning With Bilingual Word Graph Label Propagation
In this article we address the task of cross-lingual sentiment lexicon learning, which aims to automatically generate sentiment lexicons for the target languages with available English sentiment lexicons. We formalize the task as a learning problem on a bilingual word graph, in which the intra-language relations among the words in the same language and the interlanguage relations among the word...
Semi-Supervised Matrix Completion for Cross-Lingual Text Classification
Cross-lingual text classification is the task of assigning labels to observed documents in a label-scarce target language domain by using a prediction model trained with labeled documents from a label-rich source language domain. Cross-lingual text classification is widely studied in the natural language processing area to reduce the expensive manual annotation effort required in the target lang...
Publication date: 2011